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Abstract 

In this paper, a new algorithm to solve subset regression problems is described, called 
the minimal residual QR-factorization algorithm (MRQR). This scheme performs a QR 
factorization with a new column-pivoting strategy. Basically, this strategy is based on 
the change in the residual of the least-squares problem. 

Furthermore, it is demonstrated that this basic scheme might be extended in a 
numerically efficient way to combine the advantages of existing numerical procedures, 
such as the singular value decomposition, with those of more classical statistical pro- 
cedures, such as stepwise regression. In this paper this extension is presented as an 
advisory-expert system that guides the user in solving the subset regression problem. 

The advantages of the new procedure are highlighted by a numerical example. 


1 Introduction 


The least-squares problem considered in this paper is formulated as 

min ||Ax - 6||2 (1) 

X 

where A € R mxn (m > n) ,b € R m and ||.||2 denotes the Euclidean norm of a vector. 

For a number of applications, equation (1) is not solvable because of either the 
redundancy in the specified columns of the A-matrix in (1) or the specification of columns 
bearing “little” relationship with the right hand side (rhs) b in (1). These two types of 
difficulties have been treated in the literature along two different lines: The first difficulty 
could reliably be treated and analyzed using the singular- value decomposition (SVD) [9]. 
An attractive way to treat the second difficulty is via the stepwise regression technique 
(SRT) [12], 

In practice, these two difficulties might arise in combination. Therefore, this certainly 
provides motivation for a procedure that handles both difficulties in a reliable way. For 
example, in the aerodynamic model identification problem [2], [11], this (new) procedure 
should address the following problem formulation 
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Select the minimal number of columns of A which maximally “explain” the rhs 
b (or which yield the smallest residual). 

This particular problem is analyzed in this paper and referred to as the subset regression 
problem (SRP). 

For this type of SRP, existing solutions have a number of major drawbacks. The 
drawbacks of the SVD are: 

1 . It does not perform a subset selection from the originally defined parameter vector x in 
(1), but generally finds estimates for all its components. This violates the minimality 
condition in the SRP formulation. To address this problem, an attempt to extend 
the capabilities of the SVD is made in [3]. Here additional use is made of the QR 
factorization with column pivoting as described in [8]. This not only increases the 
computational complexity of the overall procedure, but a number of examples remain 
where this pivoting strategy fails (e.g., in [4] pp. 791-792). 

2. In [5] it was demonstrated that the use of the SVD in solving least-squares problems 
generally requires different column-scaling procedures which often are contradictory. 

3. The e-numerical rank, defined in [3], which is a key element in the solution based 
on the SVD, is determined based only on the A-matrix. The rhs b is not included 
in this process. However, in solving so-called “ill-defined, rank-deficient least squares 
problems” 1 the rhs b might supply the crucial information. 

The SRT may be considered as the most suitable technique presently available to 
address the SRP. It certainly overcomes a number of the drawbacks of the SVD; however, 
its drawbacks are: 

1. Numerical robustness may be limited because of the use of the normal equations. 

2. Even if its key decision parameters used in selecting individual columns of A, which 
often are taken as the partial F-ratios, are computed reliably, this selection process 
might become indefinite or lead to an ill-conditioned solution, as demonstrated in this 
paper. 

In this paper, a new procedure is described that overcomes the drawbacks of the 
techniques outlined above. The basic algorithm in this procedure is called the Minimal 
Residual QR factorization algorithm (MRQR). This algorithm computes a QR factorization 
of A with a new column-pivoting strategy. The column pivoting is based on the change in 
residual of the least squares problem caused by the selection of the different “candidate” 
columns of A. It is shown that this technique can be used as a robust implementation 
of SRT. A description of the MRQR is given in section 2 and implementation aspects are 
briefly discussed in section 3. The extension of the MRQR is presented in this paper as an 
“advisory-expert” system; this is discussed in section 4. Some of the advantages of this new 
procedure are demonstrated by a numerical example in section 5. Finally, the concluding 
remarks are presented in section 6. 

1 Ill-defined, rank-deficient least squares problems arise when in the set of singular values of the A-matrix, 
ordered in decreasing magnitude, “small” values occur (relative to the errors on the entries of the ,4-matrix); 
however, there is no clear “gap” present between any set of two subsequent singular values. 
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2 A description of the MRQR 


In this section an outline of the Minimal Residual QR algorithm is given in pseudo- 
programming-language form. The precise algorithmic details of computing the different 
parts of the core of the algorithm are discussed in the following section. 

The MRQR 

INITIALIZATION 

rank 2 = n 

A = A 1 = [a}---oJ_ 1 ojo} +1 ---a^] b = b 1 (2) 

DO i = l:n, 

STEP 1 . Select the column vector of A 1 “most closely” related to b ' , 

=> denote this column vector as o’ 

STEP 2. Interchange the j th and t th column vector of A * by 
the column permutation matrix tt, 

STEP 3. Perform an orthogonal projection Qi, such that 



STEP 4. Rank determination test: 


This can more generally be referred to as the “termination test” and here dif- 
ferent methods may be desirable to terminate the MRQR depending on the 
goal of our SRP. 

a) Numerically we could be interested in the e-numerical rank of the 
matrix A [3]. 

b) Practically we could be interested in whether the residual of our least 
squares problem is sufficiently small or the remaining “candidate” col- 
umns of A are related to the rhs in a (statistically) insignificant way. 


END: 

STOP: 


The main part of this algorithm are the four steps in the “DO-loop”. A combined 
execution of these four steps is referred to as a “sweep” of the MRQR algorithm. 

2 The notion of rank in the MRQR may be considered for our purposes as the (minimal) number of 
columns of A that yield the minimal residual of the least squares problem (additional comments are given 
in STEP 4 of the DO-loop) 
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3 Implementation of the MRQR algorithm 


A crucial task in the MRQR scheme is the selection of that column of A x “most 
closely” related to the rhs vector fc\ This task is formulated in the first step of each sweep 
of the MRQR algorithm. A possible way to put this formulation into a mathematical 
framework is as follows: 


Find the column vector of A 1 having the minimal residual when solving the 
following sub-least squares problem: 

nun ||aj£- 6'|| (4) 


for each column vector of A 1 . 


When using this problem formulation, the first step is to first compute these residuals. 
The residuals resulting from solving the set of problems defined in (4) are denoted by ||mj||. 
The second step is to find the minimum of the obtained sequence {||m^|| for t = t : n}. The 
calculation of the residuals when the entries of A are noise-free is addressed next. 


For any set of two vectors {a^fc 1 }, that define problem (4), the residual is the length 
of the component of b % orthogonal to a\. This component is denoted in Fig. 1 by m x t 

There are a number of known procedures for computing ||m*||2* A first procedure is 
simply based on the sine-rule of a rectangular triangle. In this case, 1 1 1 1 2 (see Fig. 1) can 
be expressed as 

ll"*<lla = (5) 

or it can be written as 

Kill = - cos 2 *;) (6) 

The quantities sin <f>\ and cos <j>\ in (5) and (6) may be calculated as follows: 


sin <f>\ 


114 x y'H, 

ii4ii#ii 2 


(7) 


cos <j>£ — 



( 8 ) 


Equation (8) can be computed rather straigtforwardly and substituted in (6). On the other 
hand, the cross-product in (7) requires some further explanation. A possible way to evaluate 
that cross-product is as follows: 


|| a\ x 6* ||| = det(< a\ b x >'< a\ b x >) 


(9) 


A reliable procedure to find the determinant in (9) can use either the SVD or QR decom- 
position of < a\ b x >, since either gives 



where s, denotes the i th singular value and the »’ th diagonal element of the upper trian- 
gular factor of the SVD and QR decomposition, respectively, of < a\ b* >. 

A second procedure is to compute the residual ||mj ||2 directly from solving the sub- 
least squares problems stated in (4) for all columns of the A'-matrix. For example, this can 
be done by means of a QR decomposition of the matrix < aj 6- > [8], 

From the set of residuals {||mj|| 2 , • • • , 1 1 m ^ j 1 2 } of the different column vectors of A', we 
now have to select the minimum. The column corresponding to this minimum is represented 
by a‘ in STEP 1 of the DO-loop described in section 2. 

4 Extending the MRQR to an advisory-expert system 

4.1 Difficulties with the column selection procedure in the MRQR 

As outlined in section 3, the key parameters in the column selection procedure of the 
MRQR are the set of residuals { [ j 1 1 2 > * * • > 1 1 1 1 2 } - In [1] it has been demonstrated that 

these residuals are related in a very simple manner to the statistical partial F-ratios used 
in the SRT. Therefore, the MRQR might be used as a robust implementation of the SRT. 

However, even if these F-ratios or residuals can be reliably computed, the following 
difficulties might arise: 

1. The set of residuals {||m| H 2 , • •* , ||mjj| 2 } might have several minima, causing the col- 
umn selection process of the MRQR to become indefinite (an example of this phe- 
nomenon is given in section 5). 

2. The same set of residuals might show a clear minimum, say ||m* || 2 ; however, the 
inclusion of the corresponding column vector of A leads to a very badly conditioned 
submatrix T of (3). 

A possible way to overcome this set of difficulties is to reformulate the selection 
procedure into the following “min-max” formulation: 


Find the column vector a’ from the candidate columns ,aj,} which has 

a minimal residual and which leads to the maximal, smallest singular value of 
the matrix in (3). 

The selection of the column a* as defined in the min-max formulation corresponds to finding 
the minimum of the following new sequence: 


{ 


Klla 


igyi } 


( 11 ) 


where a\ corresponds to the i-th singular value of the matrix 
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Remark 1: The above min-max formulation is the “exact” mathematical or algorithmic 
translation of the objective of the SRP stated in the introduction. 

Remark 2: The quantities a\ of (11) can be computed individually by the inverse iteration 
method [7]. This calculation becomes efficient when operating only on the upper triangular 

. ( | * \ 

V o I Hh ) 

From this discussion it is clear that the difficulties just described allow (or force) us 
to incorporate additional decision parameters in the column selection process. Furthermore, 
it was demonstrated that one such additional parameter, namely the smallest singular value 
o\ in (11), can efficiently and reliably be calculated during the operation of the MRQR. 

Other decision parameters can be derived in a similar way. (l) Additional statistical 
parameters can be computed, such as the partial-correlation coefficients as demonstrated in 
[6] or the set of collinearity coefficients defined in [5]. (2) Additional numerical parameters 
that might be derived are an estimate of the condition number of the upper triangular 
matrix defined in remark 3, or simply the distance of the remaining candidate columns to 
the already-selected columns of A (as is given by the quantity ||a ^||2 using the notation in 

( 3 ))- 


These different parameters not only overcome the difficulties of the original column- 
selection process in STEP 1 of the DO-loop of the MRQR, but also allow us to realize both 
termination schemes or a “trade-off” between them as was desired in STEP 4 of the same 
DO-loop. 

The challenge we are now facing is to combine and use all or some of these decision 
parameters. This can be done by the proposition of an advisory-expert system that is 
described in the next section. 


4.2 The advisory-expert system for solving SRP 

In the proposed advisory-expert system, more than two decision parameters could 
be incorporated in the column-selection of the MRQR by making this algorithm interactive. 

A decision screen is used as depicted in Fig. 2. This decision screen consists of two 
levels , which could be represented either simultanously or subsequently upon request of the 
user. The first level presents the residual sequence { 1 1 m J 1 1 2 , • • • , Hmjjfe}. In Fig. 2 this 
information is first presented graphically, making it easy to scan these quantitiesjthen the 
additional numerical and statistical parameters are presented. This information is presented 
in Fig. 2 in a tabular form. 

The main advantage of presenting the decision parameters in this format is that this 
information “only” advises the user on which column to select and that it is actually the 
user who makes that selection. Therefore, this present system makes it possible to use the 
“expertise” of the user and furthermore allows for the incorporation of an important class 
of information often available in practical applications [11], [2] referred to as “a priori” 
information. 
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5 A numerical example 


The implementation of the MRQR as well as the derived advisory-expert system 
outlined respectively in sections 2, 3, and 4 are now evaluated via a numerical example. 
This example is performed on a VAX 780 computer in double precision and is taken from 
[10], p. 219. Here, a least-squares problem is stated 



( 1 21 41 61 > 


4.95 \ 


2 22 42 62 


5 

minj; || 

3 23 43 63 

x - 

5 


^ 20 40 60 80 j 


^ 5 ; 


By using the column notation of the A-matrix, as defined in (2), we clearly see that its 
e-numtrical rank (with e taken equal to the machine precision) is 2, since 

2ai + cl i + 2 a 4 

° ! = ~~r~ as = —r~ 

The MRQR is now applied to this example. Instead of automatically selecting the 
column of A having the minimal residual, the first level of the advisory-expert system as 
defined in section 4.2 was used to perform the column selection. This (basic) selection 
screen is shown in Fig. 3. From this figure it is obvious that the fourth column of A must 
be selected. The first level of the selection screen for the second sweep is shown in Fig. 4. 
At this time we requested the second level of the advisory-expert system. In the present 
example, the information supplied at this level consisted of the smallest singular value, 
defined in (II), and the distance between the candidate columns of A and the already- 
selected columns. This information is summarized in table 1. From this table, it again 
becomes clear that the original first column of the A-matrix is the correct choice. 




Original column number i 

Decision parameter 

1 

2 

3 

Distance ||of H 2 

21.87 

14.58 

7.29 

Smallest singular value aj 

21.62 

13.36 

5.92 


Table 1: The second level of the second sweep of the advisory-expert system in the 

numerical example. 

Finally, in the third sweep, the statistical parameters as well as the numerical pa- 
rameters considered in the previous sweeps reveal either that the rhs has been explained 
entirely or that the c-numerical rank of the matrix A is 2. Therefore, the algorithm is 
terminated at this point. 


6 Concluding remarks 


In this paper a new algorithmic solution is presented to solve subset regression 
problems (SRP). This solution is called the minimal-residual QR-factorization algorithm 
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(MRQR). It has been demonstrated that the MRQR could be extended to an advisory- 
expert system that also allows analysis of the SRP where the system matrix is ill-conditioned. 
This advisory-expert system combines the following advantages over existing techniques 
such as the singular-value decomposition and stepwise regression techniques. 


1. It is a robust scheme that can also be implemented in a computationally-efficient 
manner. 

2. It allows for the selection of a minimal subset of the originally-defined columns of the 
system matrix that lead to a minimum residual of the least squares problem. This 
has been achieved by the following unique features of the MRQR, that are exploited 
in the advisory-expert system: 

(a) The ability to combine numerical parameters, such as smallest singular values, 
as well as statistical parameters, such as partial F-ratios, in the column-selection 
process. 

(b) The versatility to terminate the algorithm, thereby allowing the user to focus on 
different problems that might arise with practical applications, such as a numer- 
ically ill-conditioned system matrix or candidate columns having no relationship 
(in a statistical sense) with the right-hand side. 


In conclusion, the technique presented in this paper allows solution of the SRP or 
rank-deficient least squares problems where existing solutions fail. An example might be ill- 
defined rank-deficient least squares problems, which have been defined in the introduction. 
This topic is still under full investigation. 
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Figure 3: First level of the first sweep of the advisory-expert system in the numerical 
example. 



Figure 4: First level of the second sweep of the advisory-expert system in the numerical 
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